
[core] Create event log files lazily to avoid stale/empty logs #64366

Open
nuocbiendang12 wants to merge 2 commits into ray-project:master from nnmt2810:fix/64153-lazy-log-creation

@nuocbiendang12 nuocbiendang12 commented Jun 26, 2026

Why are these changes needed?

Fixes #64153
Ray currently creates several event log files eagerly at process startup,
even when they are never written to:

  • logs/events/event_CORE_WORKER_<pid>.log — always empty
  • logs/events/event_GCS.log — usually empty
  • logs/events/event_RAYLET.log — usually empty
  • logs/export_events/*.log — empty when new export framework is enabled

This creates unnecessary file clutter and wastes I/O on every Ray startup.

What do these changes do?

Changes LogEventReporter in src/ray/util/event.cc to initialize the
underlying log file lazily. The log file is now created and opened on the
first call to Report() or ReportExportEvent(), not in the constructor.

Key changes:

  • Added EnsureSinkInitialized() private helper method
  • Store log_sink_key_ as a member variable for deferred initialization
  • Modified Flush() to safely handle an uninitialized sink
  • All existing behavior is preserved when events are actually written

Related issues/PRs

Closes #64153

Checks

  • I've signed off all commits with git commit -s
  • I've run scripts/format.sh to lint the changes in Python
  • I've run clang-format on C++ changes
  • All existing tests pass locally
  • Added/modified tests to cover the changes (if applicable)

Contributors

  • Lê Thanh Châu (MSSV: 19120463) - 19120463@student.hcmus.edu.vn
  • Nguyễn Ngọc Minh Tuấn (MSSV: 23120102) - 23120102@student.hcmus.edu.vn

@nuocbiendang12 nuocbiendang12 requested a review from a team as a code owner June 26, 2026 05:20

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces lazy initialization for the LogEventReporter's spdlog logger (log_sink_) to prevent empty log files from being created at startup. The review feedback highlights a critical thread-safety issue: because EnsureSinkInitialized() can be called concurrently from multiple threads, the unsynchronized check and initialization of log_sink_ can lead to data races and crashes. The reviewer recommends using std::call_once with a std::once_flag to ensure thread-safe initialization.

Comment thread src/ray/util/event.h
Comment on lines +148 to 149
// The underlying spdlog logger. nullptr until the first event is written.
std::shared_ptr<spdlog::logger> log_sink_;

high

To support thread-safe lazy initialization of log_sink_ using std::call_once, we need to declare a std::once_flag member variable.

  // The underlying spdlog logger. nullptr until the first event is written.
  std::shared_ptr<spdlog::logger> log_sink_;

  // Ensure thread-safe lazy initialization of log_sink_.
  std::once_flag sink_init_once_;

Comment thread src/ray/util/event.cc
Comment on lines +86 to 101
void LogEventReporter::EnsureSinkInitialized() {
  if (log_sink_ != nullptr) {
    return;
  }
  log_sink_ = spdlog::get(log_sink_key_);
  // If the file size is over {rotate_max_file_size_} MB, this file would be renamed
  // for example event_GCS.0.log, event_GCS.1.log, event_GCS.2.log ...
  // We allow to rotate for {rotate_max_file_num_} times.
  if (log_sink_ == nullptr) {
    log_sink_ = spdlog::rotating_logger_mt(log_sink_key_,
                                           log_dir_ + file_name_,
                                           1048576 * rotate_max_file_size_,
                                           rotate_max_file_num_);
  }
  log_sink_->set_pattern("%v");
}

high

In a multi-threaded environment, EventManager::Publish can be called concurrently from multiple threads because it only acquires a reader lock (absl::ReaderMutexLock). This means LogEventReporter::Report (and consequently EnsureSinkInitialized) can be executed concurrently by multiple threads.

Without synchronization, the check if (log_sink_ != nullptr) and the subsequent assignment to log_sink_ (a std::shared_ptr) form a data race, which is undefined behavior. Furthermore, if two threads both see spdlog::get return nullptr and both call spdlog::rotating_logger_mt with the same key, the second registration throws a spdlog_ex exception, which crashes the process if it is not caught.

To resolve this, we should use std::call_once with a std::once_flag to ensure that log_sink_ is initialized safely and exactly once.

void LogEventReporter::EnsureSinkInitialized() {
  std::call_once(sink_init_once_, [this]() {
    log_sink_ = spdlog::get(log_sink_key_);
    // If the file size is over {rotate_max_file_size_} MB, this file would be renamed
    // for example event_GCS.0.log, event_GCS.1.log, event_GCS.2.log ...
    // We allow to rotate for {rotate_max_file_num_} times.
    if (log_sink_ == nullptr) {
      log_sink_ = spdlog::rotating_logger_mt(log_sink_key_,
                                             log_dir_ + file_name_,
                                             1048576 * rotate_max_file_size_,
                                             rotate_max_file_num_);
    }
    log_sink_->set_pattern("%v");
  });
}
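
To see the effect of the suggestion in isolation, here is a small self-contained sketch under stated assumptions. Registry is a hypothetical stand-in that, like the behavior described above, throws when the same key is registered twice; Reporter, InitCount(), RunConcurrentReports() and the key "example_key" are likewise invented for illustration and are not part of Ray or spdlog.

```cpp
#include <atomic>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <string>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical registry: a second registration under the same key throws.
class Registry {
 public:
  std::shared_ptr<int> Create(const std::string &key) {
    std::lock_guard<std::mutex> lock(mu_);
    if (entries_.count(key) > 0) {
      throw std::runtime_error("already registered: " + key);
    }
    auto entry = std::make_shared<int>(0);
    entries_.emplace(key, entry);
    return entry;
  }

 private:
  std::mutex mu_;
  std::unordered_map<std::string, std::shared_ptr<int>> entries_;
};

// Hypothetical reporter whose sink is created lazily and exactly once.
class Reporter {
 public:
  Reporter(Registry &registry, std::string key)
      : registry_(registry), key_(std::move(key)) {}

  void Report() {
    EnsureSinkInitialized();
    // Every caller sees a fully initialized sink_ after the line above.
    reports_.fetch_add(1);
  }

  int InitCount() const { return init_count_.load(); }

 private:
  void EnsureSinkInitialized() {
    // Concurrent callers block until the first one finishes the lambda.
    std::call_once(init_once_, [this]() {
      sink_ = registry_.Create(key_);
      init_count_.fetch_add(1);
    });
  }

  Registry &registry_;
  std::string key_;
  std::once_flag init_once_;
  std::shared_ptr<int> sink_;  // nullptr until the first Report()
  std::atomic<int> init_count_{0};
  std::atomic<int> reports_{0};
};

// Calls Report() from num_threads threads and returns how many times the
// sink was initialized.
int RunConcurrentReports(int num_threads) {
  Registry registry;
  Reporter reporter(registry, "example_key");
  std::vector<std::thread> threads;
  for (int i = 0; i < num_threads; ++i) {
    threads.emplace_back([&reporter]() { reporter.Report(); });
  }
  for (auto &t : threads) {
    t.join();
  }
  return reporter.InitCount();
}
```

Because std::call_once runs the lambda once and makes the other callers wait for it to finish, Registry::Create is never reached a second time, so the duplicate-key exception cannot fire. One trade-off worth noting: a std::once_flag member makes the enclosing class non-copyable and non-movable.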

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.

Reviewed by Cursor Bugbot for commit ea2243f.

Comment thread src/ray/util/event.cc Outdated
Comment thread src/ray/util/event.h
@ray-gardener ray-gardener Bot added the core (Issues that should be addressed in Ray Core) and community-contribution (Contributed by the community) labels Jun 26, 2026
@nuocbiendang12 nuocbiendang12 force-pushed the fix/64153-lazy-log-creation branch 3 times, most recently from 21370c9 to 7790675 Compare June 28, 2026 04:33
Fixes ray-project#64153

Previously, LogEventReporter created event log files eagerly in its
constructor, causing empty log files to appear on every Ray startup
even when no events were ever written:
- logs/events/event_CORE_WORKER_<pid>.log (always empty)
- logs/events/event_GCS.log (usually empty)
- logs/events/event_RAYLET.log (usually empty)
- logs/export_events/*.log (empty when export framework enabled)

This change defers log file creation to the first call to Report()
or ReportExportEvent() via a new EnsureSinkInitialized() helper.
If no events are written, no log file is created on disk.

Changes:
- Added EnsureSinkInitialized() private helper method that creates
  the spdlog rotating_logger_mt only on first use
- Added log_sink_key_ member field to store the key for deferred init
- Modified constructor to only compute and store the file name/path,
  without opening the file
- Modified Flush() to safely handle an uninitialized (nullptr) sink
- Modified Report() and ReportExportEvent() to call EnsureSinkInitialized()
  before writing

All existing behavior is preserved when events are actually written.

Signed-off-by: Lê Thanh Châu <19120463@student.hcmus.edu.vn>
@nuocbiendang12 nuocbiendang12 force-pushed the fix/64153-lazy-log-creation branch from 7790675 to f680daf Compare June 28, 2026 04:40

Labels

community-contribution: Contributed by the community
core: Issues that should be addressed in Ray Core

Development

Successfully merging this pull request may close these issues.

[Core] Stale/empty logs always being created

2 participants